GSCALER: Synthetically Scaling A Given Graph

Authors

  • J. W. Zhang
  • Y. C. Tay
Abstract

Enterprises and researchers often have datasets that can be represented as graphs (e.g. social networks). The owner of a large graph may want to scale it down to a smaller version, e.g. for application development. Conversely, the owner of a small graph may want to scale it up to a larger version, e.g. to test system scalability. This paper investigates the Graph Scaling Problem (GSP): given a directed graph G and positive integers ñ and m̃, generate a similar directed graph G̃ with ñ nodes and m̃ edges. The paper presents Gscaler, a graph scaling algorithm for GSP. Analogous to DNA shotgun sequencing, Gscaler decomposes G into small pieces, scales them, then uses the scaled pieces to construct G̃. This construction is based on the indegree/outdegree correlation of nodes and edges. Extensive tests with real graphs show that Gscaler is scalable and that, for many graph properties, it generates a G̃ that has greater similarity to G than other state-of-the-art solutions, such as Stochastic Kronecker Graph and UpSizeR.
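The abstract does not give Gscaler's details, so the following is only a minimal sketch of the general idea of decomposing a directed graph into per-node (indegree, outdegree) pieces and scaling how many pieces of each kind there are. It is not the authors' algorithm. The function names `degree_pairs` and `scale_pair_counts`, and the largest-remainder rounding, are assumptions made for illustration.

```python
from collections import Counter


def degree_pairs(edges):
    """Map each node of a directed edge list to its (indegree, outdegree)."""
    indeg, outdeg = Counter(), Counter()
    nodes = set()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
        nodes.update((u, v))
    return {n: (indeg[n], outdeg[n]) for n in nodes}


def scale_pair_counts(pairs, n_target):
    """Scale the histogram of (indegree, outdegree) pairs to n_target nodes.

    Hypothetical helper: proportional scaling with largest-remainder
    rounding, so the scaled counts sum exactly to n_target.
    """
    hist = Counter(pairs.values())
    n = sum(hist.values())
    exact = {p: c * n_target / n for p, c in hist.items()}
    scaled = {p: int(x) for p, x in exact.items()}  # floor of each share
    leftover = n_target - sum(scaled.values())
    # Hand the remaining nodes to the pairs with the largest fractional part.
    ranked = sorted(exact, key=lambda p: (exact[p] - scaled[p], p), reverse=True)
    for p in ranked[:leftover]:
        scaled[p] += 1
    return scaled


edges = [(1, 2), (1, 3), (2, 3), (3, 1)]
pairs = degree_pairs(edges)
doubled = scale_pair_counts(pairs, 6)
```

This sketch only fixes the node count ñ. It does not enforce the edge count m̃, nor that total indegree equals total outdegree in the scaled histogram, both of which a real GSP solution would have to handle before wiring the scaled pieces into G̃.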


Similar resources

Diameter Two Graphs of Minimum Order with Given Degree Set

The degree set of a graph is the set of its degrees. Kapoor et al. [Degree sets for graphs, Fund. Math. 95 (1977) 189-194] proved that for every set of positive integers, there exists a graph of diameter at most two and radius one with that degree set. Furthermore, the minimum order of such a graph is determined. A graph is 2-self-centered if its radius and diameter are two. In this paper for ...


Dscaler: Synthetically Scaling A Given Relational Database

The Dataset Scaling Problem (DSP) defined in previous work states: Given an empirical set of relational tables D and a scale factor s, generate a database state D̃ that is similar to D but s times its size. A DSP solution is useful for application development (s < 1), scalability testing (s > 1) and anonymization (s = 1). Current solutions assume all table sizes scale by the same ratio s. Howeve...


Graph Hybrid Summarization

One solution for processing and analyzing massive graphs is summarization. Generating a high-quality summary is the main challenge of graph summarization. To generate a better-quality summary for a given attributed graph, both structural and attribute similarities must be considered. There are two measures named density and entropy to evaluate the quality of structural and at...


A matrix-algebraic formulation of distributed-memory maximal cardinality matching algorithms in bipartite graphs

We describe parallel algorithms for computing maximal cardinality matching in a bipartite graph on distributed-memory systems. Unlike traditional algorithms that match one vertex at a time, our algorithms process many unmatched vertices simultaneously using a matrix-algebraic formulation of maximal matching. This generic matrix-algebraic framework is used to develop three efficient maximal match...


Estimating Perimeter Using Graph Cuts

We investigate the estimation of the perimeter of a set by a graph cut of a random geometric graph. For Ω ⊂ D = (0, 1)^d, with d ≥ 2, we are given n random i.i.d. points on D whose membership in Ω is known. We consider the sample as a random geometric graph with connection distance ε > 0. We estimate the perimeter of Ω (relative to D) by the appropriately rescaled graph cut between the vertices...




Publication date: 2016